UKP at CrossLink: Anchor Text Translation for Cross-lingual Link Discovery
Abstract
This paper describes UKP’s participation in the cross-lingual link discovery (CLLD) task at NTCIR-9. The task is to find valid anchor texts in a new English Wikipedia page and retrieve the corresponding target Wikipedia pages in Chinese, Japanese, and Korean. We developed a CLLD framework consisting of four subtasks: anchor selection, anchor ranking, anchor translation, and target discovery. For anchor selection, anchor ranking, and target discovery, we largely rely on state-of-the-art monolingual approaches. For anchor translation, we use a translation resource constructed from Wikipedia itself and also explore a number of methods widely used for short phrase translation. Our formal runs performed very competitively compared to other participants’ systems. Our system came first in the English-2-Chinese and English-2-Korean file-to-file (F2F) evaluations with manual assessment and anchor-to-file (A2F) evaluations with Wikipedia ground-truth assessment, using the Mean Average Precision (MAP) measure.
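The MAP measure mentioned in the abstract can be illustrated with a minimal sketch of the textbook definition. This is not the official NTCIR CrossLink evaluation tool, which may use its own variant. The function names, topic identifiers, and page titles below are hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence, Set


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Textbook average precision of one ranked list of target pages."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    seen: Set[str] = set()
    for rank, item in enumerate(ranked, start=1):
        # Count each relevant target only the first time it appears.
        if item in relevant and item not in seen:
            hits += 1
            precision_sum += hits / rank
        seen.add(item)
    # Normalise by the number of relevant targets, so missed targets lower the score.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, Sequence[str]], ground_truth: Dict[str, Set[str]]
) -> float:
    """Mean of per-topic average precision over all topics in the ground truth."""
    if not ground_truth:
        return 0.0
    total = sum(
        average_precision(runs.get(topic, []), relevant)
        for topic, relevant in ground_truth.items()
    )
    return total / len(ground_truth)


# Hypothetical example: two English topics with ranked target pages.
runs = {"topic_1": ["a", "x", "b"], "topic_2": ["x", "c"]}
ground_truth = {"topic_1": {"a", "b"}, "topic_2": {"c"}}
score = mean_average_precision(runs, ground_truth)
```

In this example the first topic scores (1/1 + 2/3) / 2 and the second scores 1/2, so `score` is their mean, roughly 0.667. Topics missing from a run count as zero, which is one common convention and an assumption here.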
Related resources
UKP at CrossLink2: CJK-to-English Subtasks
This paper describes UKP’s participation in the cross-lingual link discovery task at NTCIR-10 (CrossLink2). The task addressed in our work is to find valid anchor texts from a Chinese, Japanese, and Korean (CJK) Wikipedia page and retrieve the corresponding target Wiki pages in the English language. The CrossLink framework was developed based on our previous CrossLink system that works on the o...
Automated Cross-lingual Link Discovery in Wikipedia
At NTCIR-9, we participated in the cross-lingual link discovery (Crosslink) task. In this paper we describe our approaches to discovering Chinese, Japanese, and Korean (CJK) cross-lingual links for English documents in Wikipedia. Our experimental results show that a link mining approach that mines the existing link structure for anchor probabilities and relies on the “translation” using cross-l...
KECIR at NTCIR-10 Cross-Lingual Link Discovery Task
This paper presents the methods of KECIR at NTCIR-10 Cross-Lingual Link Discovery Task. Two architectures of systems are designed, both of which consist of three common modules such as anchor detection, anchor translation and link discovery. In KECIR A2F C2E 03 FSCLIR and KECIR A2F C2E 04 FSCLIR, monolingual link discovery module is considered. In order to detect anchor, feature selection metho...
NTHU at NTCIR-10 CrossLink-2: An Approach toward Semantic Features
This paper describes the approaches of NTHU in the NTCIR-10 Cross-Lingual Link Discovery task, also named CrossLink-2. In this task, we aim to discover valuable anchors in Chinese, Japanese or Korean (CJK) articles and to link these anchors to related English Wikipedia pages. To achieve the objective, we do not only depend on Wikipedia’s distinguishing features (e.g. anchor links information an...
Overview of the NTCIR-10 Cross-Lingual Link Discovery Task
This paper presents an overview of NTCIR-10 Cross-lingual Link Discovery (CrossLink-2) task. For the task, we continued using the evaluation framework developed for the NTCIR-9 CrossLink-1 task. Overall, recommended links were evaluated at two levels (file-to-file and anchor-to-file); and system performance was evaluated with metrics: LMAP, R-Prec and P@N.
Publication date: 2011